Table structure recognition model integrating edge features and attention
Xueqiang LYU, Yunan ZHANG, Jing HAN, Yunpeng CUI, Huan LI
Journal of Computer Applications    2023, 43 (3): 752-758.   DOI: 10.11772/j.issn.1001-9081.2022010053

To address the dependence on prior knowledge, limited robustness, and limited expressive power of existing table structure recognition methods, a new model integrating edge features and attention was proposed: the Graph Edge-Attention Network based Table Structure Recognition model (GEAN-TSR). First, a Graph Edge-Attention Network (GEAN) was proposed as the backbone: building on an edge-convolution structure, an improved graph attention mechanism was introduced to aggregate graph node features, mitigating the information loss that occurs during the graph network's feature extraction and improving its expressive power. Then, an edge feature fusion module was introduced to fuse shallow graph node information with the graph network's output, strengthening the network's ability to extract and represent local information. Finally, graph node text features extracted by a Gated Recurrent Unit (GRU) were integrated in a text feature fusion module for edge classification and prediction. Comparative experiments on the Scientific paper Table Structure Recognition-COMPlicated (SciTSR-COMP) dataset show that GEAN-TSR improves recall and F1 score by 2.5 and 1.4 percentage points, respectively, over the best existing model, Split, Embed and Merge (SEM). Ablation experiments show that every metric of GEAN-TSR reaches its best value when the feature fusion module is used, demonstrating the module's effectiveness. The results show that GEAN-TSR effectively improves network performance and better accomplishes the table structure recognition task.
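The core idea of edge-attention aggregation described above can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the function name, shapes, and the single attention weight vector `w` are assumptions; the actual GEAN layer uses learned projections and an improved multi-head formulation.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def edge_attention_aggregate(x, edges, edge_feats, w):
    """Aggregate node features over graph edges with attention weights
    that condition on [source, target, edge] features (simplified sketch).

    x: (N, d) node features; edges: list of (src, dst) pairs;
    edge_feats: (E, e) per-edge features; w: (2d + e,) attention weights.
    Returns (N, d) aggregated node features.
    """
    out = np.zeros_like(x)
    # raw attention score per edge from concatenated node and edge features
    scores = np.array([w @ np.concatenate([x[s], x[t], f])
                       for (s, t), f in zip(edges, edge_feats)])
    dsts = np.array([t for _, t in edges])
    # normalise scores over the incoming edges of each destination node,
    # then take the attention-weighted sum of source-node features
    for node in np.unique(dsts):
        idx = np.where(dsts == node)[0]
        alpha = softmax(scores[idx])
        for a, i in zip(alpha, idx):
            out[node] += a * x[edges[i][0]]
    return out
```

With untrained (zero) weights the scores tie, so a node with two incoming edges simply averages its neighbours' features; training `w` lets the layer weight neighbours by how relevant the connecting edge is.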

Text multi-label classification method incorporating BERT and label semantic attention
Xueqiang LYU, Chen PENG, Le ZHANG, Zhi’an DONG, Xindong YOU
Journal of Computer Applications    2022, 42 (1): 57-63.   DOI: 10.11772/j.issn.1001-9081.2021020366

Multi-Label Text Classification (MLTC) is an important subtask of Natural Language Processing (NLP). To model the complex correlations among multiple labels, an MLTC method named TLA-BERT was proposed that incorporates Bidirectional Encoder Representations from Transformers (BERT) and label semantic attention. First, a contextual vector representation of the input text was learned by fine-tuning the pre-trained autoencoding model. Second, each label was encoded individually with a Long Short-Term Memory (LSTM) network. Finally, an attention mechanism explicitly highlighted the contribution of the text to each label in order to predict the multi-label sequence. Experimental results show that, compared with the Sequence Generation Model (SGM) algorithm, the proposed method improves the F value by 2.8 and 1.5 percentage points on the Arxiv Academic Paper Dataset (AAPD) and the Reuters Corpus Volume I (RCV1)-v2 public dataset, respectively.
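The label semantic attention step described above can be sketched as follows. This is a hypothetical NumPy illustration, not the TLA-BERT implementation: the function name and the dot-product scoring are assumptions; in the paper the token states come from fine-tuned BERT and the label vectors from an LSTM encoder, with learned scoring layers.

```python
import numpy as np

def label_attention_logits(token_states, label_embs):
    """For each label, attend over the token states and score the label
    against its own attended text representation (simplified sketch).

    token_states: (T, d) contextual token vectors (e.g. from BERT);
    label_embs:   (L, d) per-label semantic vectors (e.g. from an LSTM).
    Returns (L,) unnormalised per-label logits.
    """
    # attention of each label over the tokens: (L, T)
    scores = label_embs @ token_states.T
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn = attn / attn.sum(axis=1, keepdims=True)
    # label-specific text representation: (L, d)
    ctx = attn @ token_states
    # one logit per label: dot product with that label's embedding
    return np.einsum('ld,ld->l', ctx, label_embs)
```

Because each label builds its own attention distribution over the tokens, different labels can focus on different parts of the same document, which is what lets the model highlight each label's supporting evidence explicitly.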
